Clustering With Outlier Removal
Authors
Abstract
Cluster analysis and outlier detection are two continuously rising topics in the data mining area, which in fact connect to each other deeply. Cluster structure is vulnerable to outliers; inversely, outliers are the points belonging to none of the clusters. Unfortunately, most existing studies do not notice the coupled relationship between these two tasks and handle them separately. In this article, we consider the joint cluster analysis and outlier detection problem, and propose the Clustering with Outlier Removal (COR) algorithm. Specifically, the original space is transformed into a binary space via generating basic partitions. We employ Holoentropy to measure the compactness of each cluster without involving several outlier candidates. To provide a neat and efficient solution, an auxiliary matrix is introduced so that COR completely and efficiently solves the challenging problem via a unified K-means-- with theoretical supports. Extensive experimental results on numerous data sets in various domains demonstrate that the effectiveness and efficiency of COR significantly exceed those of state-of-the-art methods in terms of cluster validity and outlier detection. Some key factors of COR, including the basic partition number and generation strategy, and an application on abnormal flight trajectory are further analyzed for practical use.
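The two steps named in the abstract, mapping points into a binary space through basic partitions and then running a K-means-- style loop that sets aside the farthest points, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain squared Euclidean distance instead of the Holoentropy objective and the auxiliary matrix, the three basic partitions are hand-written hypothetical inputs, and the function names `to_binary_space` and `kmeans_minus_minus` are chosen here for illustration only.

```python
def to_binary_space(partitions, k):
    """Concatenate one-hot encodings of each basic partition's labels.

    partitions: list of label lists, one per basic partition (assumed input).
    k: number of clusters used in every basic partition (assumed equal).
    """
    rows = []
    for i in range(len(partitions[0])):
        row = []
        for labels in partitions:
            row.extend(1 if labels[i] == j else 0 for j in range(k))
        rows.append(row)
    return rows


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_minus_minus(points, init_centers, n_outliers, n_iter=10):
    """K-means-- style loop: before each center update, the n_outliers
    points farthest from their nearest center are labelled -1."""
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Nearest center (and its squared distance) for every point.
        nearest = []
        for p in points:
            d = [sq_dist(p, c) for c in centers]
            j = min(range(k), key=d.__getitem__)
            nearest.append((d[j], j))
        # The points with the largest distances are treated as outliers.
        order = sorted(range(len(points)),
                       key=lambda i: nearest[i][0], reverse=True)
        outliers = set(order[:n_outliers])
        labels = [-1 if i in outliers else nearest[i][1]
                  for i in range(len(points))]
        # Centers are recomputed from the remaining points only.
        for j in range(k):
            members = [points[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical basic partitions of 7 points: points 0-2 and 3-5 are grouped
# consistently (up to label permutation); point 6 is assigned inconsistently.
basic_partitions = [
    [0, 0, 0, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1],
]
binary = to_binary_space(basic_partitions, k=2)
labels = kmeans_minus_minus(binary, [binary[0], binary[3]], n_outliers=1)
print(labels)
```

In this toy input the inconsistently assigned point has a binary row that matches neither group exactly, so it is the one set aside. In practice the basic partitions would come from repeated runs of a base clustering method rather than being written by hand.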
Similar resources
Clustering with Outlier Removal
Cluster analysis and outlier detection are strongly coupled tasks in data mining area. Cluster structure can be easily destroyed by few outliers; on the contrary, the outliers are defined by the concept of cluster, which are recognized as the points belonging to none of the clusters. However, most existing studies handle them separately. In light of this, we consider the joint cluster analysis ...
DCBOR: A Density Clustering Based on Outlier Removal
Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well known single link clustering algorithm. We will refer to this algorithm as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset. So this algorithm provides outlier detection and data clustering simultane...
Improved Hybrid Clustering and Distance-based Technique for Outlier Removal
Outlier detection is the task of finding objects that are dissimilar or inconsistent with respect to the remaining data. It has many uses in applications like fraud detection, network intrusion detection and clinical diagnosis of diseases. Using clustering algorithms for outlier detection is a technique that is frequently used. The clustering algorithms consider outlier detection only to the poi...
Algorithms for optimal outlier removal
We consider the problem of removing c points from a set S of n points so that the remaining point set is optimal in some sense. Definitions of optimality we consider include having minimum diameter, having minimum area (perimeter) bounding box, having minimum area (perimeter) convex hull. For constant values of c, all our algorithms run in O(n log n) time.
Online Clustering and Outlier Detection
Clustering and outlier detection are important data mining areas. Online clustering and outlier detection generally work with continuous data streams generated at a rapid rate and have many practical applications, such as network intrusion detection and online fraud detection. This chapter first reviews related background of online clustering and outlier detection. Then, an incremental cluste...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2021
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2019.2954317